Single-Producer/Single-Consumer Queues on Shared Cache Multi-Core Systems

Author

  • Massimo Torquati
Abstract

Efficient point-to-point communication channels are critical for implementing fine-grained parallel programs on modern shared-cache multi-core architectures. This report discusses in detail several implementations of wait-free Single-Producer/Single-Consumer (SPSC) queues and presents a novel and efficient algorithm for implementing an unbounded wait-free SPSC queue (uSPSC). It also discusses the correctness proof of the new algorithm and several performance measurements based on simple synthetic benchmarks and microbenchmarks.
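
As a point of reference for what a wait-free SPSC queue looks like, here is a minimal sketch of a bounded ring buffer in C++. It is not the report's uSPSC algorithm or any of the implementations it discusses; the names SpscRing and demo_transfer, the capacity of 64, and the use of std::atomic with acquire/release ordering are all assumptions made for illustration. Each push or pop finishes in a bounded number of steps and reports failure instead of blocking when the ring is full or empty.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>

// Illustrative bounded SPSC ring buffer (hypothetical name; not the paper's uSPSC).
// Exactly one thread may call push() and exactly one thread may call pop().
template <typename T, std::size_t N>
class SpscRing {
    static_assert(N >= 2, "one slot stays empty, so N must be at least 2");
public:
    // Producer side. Returns false if the ring is full; never blocks.
    bool push(const T& value) {
        const std::size_t t = tail_.load(std::memory_order_relaxed);
        const std::size_t next = (t + 1) % N;
        if (next == head_.load(std::memory_order_acquire)) return false;  // full
        buf_[t] = value;
        tail_.store(next, std::memory_order_release);  // publish the written slot
        return true;
    }
    // Consumer side. Returns false if the ring is empty; never blocks.
    bool pop(T& out) {
        const std::size_t h = head_.load(std::memory_order_relaxed);
        if (h == tail_.load(std::memory_order_acquire)) return false;  // empty
        out = buf_[h];
        head_.store((h + 1) % N, std::memory_order_release);  // release the slot
        return true;
    }
private:
    T buf_[N];
    std::atomic<std::size_t> head_{0};  // next slot to read (written by consumer)
    std::atomic<std::size_t> tail_{0};  // next slot to write (written by producer)
};

// Hypothetical demo: one producer sends 0..count-1, one consumer sums them.
long demo_transfer(int count) {
    SpscRing<int, 64> q;
    long sum = 0;
    std::thread producer([&] {
        for (int i = 0; i < count; ++i)
            while (!q.push(i)) std::this_thread::yield();  // retry while full
    });
    std::thread consumer([&] {
        int v = 0;
        for (int i = 0; i < count; ++i) {
            while (!q.pop(v)) std::this_thread::yield();  // retry while empty
            assert(v == i);  // items arrive in FIFO order
            sum += v;
        }
    });
    producer.join();
    consumer.join();
    return sum;
}
```

Because the capacity is fixed, the producer in this sketch has to retry when the ring is full; removing that limit is the problem an unbounded queue such as uSPSC addresses.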


Similar resources

An Efficient Synchronisation Mechanism for Multi-Core Systems

The use of efficient synchronization mechanisms is crucial for implementing fine grained parallel programs on modern shared cache multi-core architectures. In this paper we study this problem by considering Single-Producer/Single-Consumer (SPSC) coordination using unbounded queues. A novel unbounded SPSC algorithm capable of reducing the row synchronization latency and speeding up Producer-Cons...

A Proof of Concept for Optimizing Task Parallelism by Locality Queues

Task parallelism as employed by the OpenMP task construct, although ideal for tackling irregular problems or typical producer/consumer schemes, bears some potential for performance bottlenecks if locality of data access is important, which is typically the case for memory-bound code on ccNUMA systems. We present a programming technique which ameliorates adverse effects of dynamic task distribut...

Optimizing a Multi-Core Processor for Message-Passing Workloads

Future large-scale multi-cores will likely be best suited for use within high-performance computing (HPC) domains. A large fraction of HPC workloads employ the message-passing interface (MPI), yet multi-cores continue to be optimized for shared-memory workloads. In this position paper, we put forth the design of a unique chip that is optimized for MPI workloads. It introduces specialized hardwar...

Scaling software on multi-core through co-scheduling of related tasks

Ever-increasing demand for more processing power, coupled with problems in designing higher-frequency chips, is forcing CPU vendors to take the multi-core route. IBM® introduced the first multi-core processor with its POWER4® in 2001, which had two cores in a chip and also 4 chips in a package. Other CPU vendors have followed the trend with dual and quad-core processors becoming increasing...

Journal:
  • CoRR

Volume: abs/1012.1824

Publication date: 2010